A Note on the Expectation-Maximization (EM) Algorithm

Author

  • ChengXiang Zhai
Abstract

The Expectation-Maximization (EM) algorithm is a general algorithm for maximum-likelihood estimation where the data are "incomplete" or the likelihood function involves latent variables. Note that the notions of "incomplete data" and "latent variables" are related: when we have a latent variable, we may regard our data as incomplete, since we do not observe the values of the latent variables; similarly, when our data are incomplete, we can often associate some latent variable with the missing data. In language modeling, the EM algorithm is often used to estimate the parameters of a mixture model, in which the exact component model from which a data point is generated is hidden from us.

Informally, the EM algorithm starts by randomly assigning values to all the parameters to be estimated. It then iteratively alternates between two steps, called the expectation step (the "E-step") and the maximization step (the "M-step"). In the E-step, it computes the expected log-likelihood of the complete data (the so-called Q-function), where the expectation is taken with respect to the conditional distribution of the latent variables (the "hidden variables") computed from the current parameter settings and the observed (incomplete) data. In the M-step, it re-estimates all the parameters by maximizing the Q-function. Once we have a new generation of parameter values, we can repeat the E-step and the M-step. This process continues until the likelihood converges, i.e., until it reaches a local maximum. Intuitively, what EM does is to iteratively "augment" the data by "guessing" the values of the hidden variables and to re-estimate the parameters by assuming that the guessed values are the true values.

The EM algorithm is a hill-climbing approach, so it can only be guaranteed to reach a local maximum. When there are multiple local maxima, whether we actually reach the global maximum clearly depends on where we start: if we start on the "right hill", we will be able to find the global maximum, but when there are multiple local maxima, it is often hard to identify the "right hill". There are two commonly used strategies for this problem. The first is to try many different initial values and choose the solution with the highest converged likelihood value. The second uses a much simpler model (ideally one with a unique global maximum) to determine an initial value for the more complex model; the idea is that the simpler model can hopefully locate a rough region containing the global optimum, and we then start from a value in that region to search for a more accurate optimum using the more complex model. There are many good tutorials on the EM algorithm (e.g., [2, 5, 1, 4, 3]). In this note, we introduce the EM algorithm through a specific problem: estimating a simple mixture model.
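To make the E-step/M-step cycle concrete, the sketch below runs EM on the kind of mixture described above: each observed word is assumed to be drawn either from a fixed background word distribution or from an unknown topic distribution, and the component that generated each word occurrence is the hidden variable. The Q-function maximized in the M-step is Q(\theta; \theta^{(t)}) = E_{Z \sim p(\cdot \mid X, \theta^{(t)})}[\log p(X, Z \mid \theta)]. This is only a minimal illustrative sketch: the names (em_mixture, background, topic, lam) and the choice to keep the mixing weight fixed are assumptions of this example, not definitions from the note.

```python
# A minimal sketch of EM for a two-component unigram mixture
# (illustrative example; names and the fixed mixing weight `lam`
# are assumptions of this sketch, not notation from the note).
#
# Generative assumption: each word w is drawn from the background
# distribution p(w | B) with probability lam, and from an unknown
# topic distribution p(w | T) with probability 1 - lam. The component
# responsible for each word occurrence is the hidden variable.

from collections import Counter

def em_mixture(word_counts, background, lam=0.5, n_iter=50):
    """Estimate the topic distribution p(w | T) with EM.

    word_counts: dict mapping word -> count in the observed text
    background:  dict mapping word -> fixed background probability p(w | B)
    lam:         fixed mixing weight of the background component
    """
    vocab = list(word_counts)
    # Initialization: start the topic model from the empirical
    # distribution (the note's description uses random initialization;
    # any valid distribution works here).
    total = sum(word_counts.values())
    topic = {w: c / total for w, c in word_counts.items()}

    for _ in range(n_iter):
        # E-step: posterior probability that each word occurrence was
        # generated by the topic component (the hidden variable), given
        # the current parameter estimates.
        z_topic = {}
        for w in vocab:
            p_t = (1 - lam) * topic[w]
            p_b = lam * background.get(w, 1e-12)
            z_topic[w] = p_t / (p_t + p_b)

        # M-step: re-estimate p(w | T) by maximizing the Q-function;
        # for a unigram mixture this reduces to normalizing the
        # expected counts assigned to the topic component.
        expected = {w: word_counts[w] * z_topic[w] for w in vocab}
        norm = sum(expected.values())
        topic = {w: e / norm for w, e in expected.items()}

    return topic

# Toy usage: the function word "the" is explained mostly by the
# background model, so EM pushes the topic distribution toward the
# content words.
counts = Counter("the em algorithm the mixture model the em".split())
bg = {"the": 0.5, "em": 0.1, "algorithm": 0.1, "mixture": 0.1, "model": 0.2}
print(em_mixture(counts, bg))
```

In the toy usage, the frequent word "the" is largely absorbed by the background component, so the estimated topic distribution concentrates on the content words: the "augment the data by guessing the hidden variables" intuition in executable form. Keeping lam fixed keeps the sketch short; estimating it as well would just add one more normalized expected count to the M-step.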


Similar resources

A Note On EM Algorithm For Mixture Models

The expectation-maximization (EM) algorithm has been used to maximize the likelihood function or posterior when the model contains unobserved latent variables. One important application of the EM algorithm is finding the maximum likelihood estimator for mixture models. In this article, we propose an EM-type algorithm to maximize a class of mixture-type objective functions. In addition, we prove th...

The Naive Bayes Model, Maximum-Likelihood Estimation, and the EM Algorithm

This section describes a model for binary classification, Naive Bayes. Naive Bayes is a simple but important probabilistic model. It will be used as a running example in this note. In particular, we will first consider maximum-likelihood estimation in the case where the data is “fully observed”; we will then consider the expectation maximization (EM) algorithm for the case where the data is “pa...

The Expectation Maximization Algorithm: A short tutorial

This tutorial discusses the Expectation Maximization (EM) algorithm of Dempster, Laird and Rubin [1]. The approach taken follows that of an unpublished note by Stuart Russell, but fleshes out some of the gory details. In order to ensure that the presentation is reasonably self-contained, some of the results on which the derivation of the algorithm is based are presented prior to the main results...

EM vs MM: A case study

The celebrated expectation-maximization (EM) algorithm is one of the most widely used optimization methods in statistics. In recent years it has been realized that the EM algorithm is a special case of the more general minorization-maximization (MM) principle. Both algorithms create a surrogate function in the first (E or M) step that is maximized in the second M step. This two-step process always...
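As a side note on the surrogate view in this abstract: for EM specifically, the minorizing surrogate can be written down directly (a standard identity, using the Q-function notation from the note above rather than anything from the cited paper):

```latex
% Jensen's inequality: for any \theta, the observed-data log-likelihood
% is minorized by the Q-function plus an entropy term that is constant
% in \theta,
\log p(X \mid \theta)
  \;\ge\;
  \underbrace{\mathbb{E}_{Z \sim p(\cdot \mid X,\, \theta^{(t)})}
      \bigl[\log p(X, Z \mid \theta)\bigr]}_{Q(\theta;\, \theta^{(t)})}
  \;+\; H\bigl(p(Z \mid X,\, \theta^{(t)})\bigr),
% with equality at \theta = \theta^{(t)}.
```

Because the bound touches the log-likelihood at \theta^{(t)} and the M-step maximizes it over \theta, each iteration can only increase (or leave unchanged) the likelihood; this is exactly the minorize-then-maximize pattern of MM.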

On Regularization Methods of EM-Kaczmarz Type

We consider regularization methods of Kaczmarz type in connection with the expectation-maximization (EM) algorithm for solving ill-posed equations. For noisy data, our methods are stabilized extensions of the well-established ordered-subsets expectation-maximization iteration (OS-EM). We show monotonicity properties of the methods and present a numerical experiment which indicates that the exte...

An Explanation of the Expectation Maximization Algorithm, Report no. LiTH-ISY-R-2915

The expectation maximization (EM) algorithm computes maximum likelihood estimates of unknown parameters in probabilistic models involving latent variables. More pragmatically speaking, the EM algorithm is an iterative method that alternates between computing a conditional expectation and solving a maximization problem, hence the name expectation maximization. We will in this work derive the EM ...



Publication date: 2004